Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning
Authors
Abstract
We systematically explore a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously addresses two major limitations of BN: (1) online learning and (2) recurrent learning. Our proposal is simpler and more biologically-plausible. Unlike previous approaches, our technique can be applied out of the box to all learning scenarios (e.g., online learning, batch learning, fully-connected, convolutional, feedforward, recurrent, and mixed recurrent-convolutional) and compares favorably with existing approaches. We also propose Lp Normalization for normalizing by different orders of statistical moments. In particular, L1 normalization performs well, is simple to implement and fast to compute, and is more biologically-plausible, which makes it well suited to GPU or hardware implementations.

This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF 1231216.

arXiv:1610.06160v1 [cs.LG] 19 Oct 2016

Table 1: An overview of normalization techniques for different tasks.

Approach                           FF & FC  FF & Conv  Rec & FC  Rec & Conv  Online Learning  Small Batch  All Combined
Original Batch Normalization (BN)  ✓        ✓          ✗         ✗           ✗                Suboptimal   ✗
Time-specific BN                   ✓        ✓          Limited   Limited     ✗                Suboptimal   ✗
Layer Normalization                ✓        ✗*         ✓         ✗*          ✓                ✓            ✗*
Streaming Normalization            ✓        ✓          ✓         ✓           ✓                ✓            ✓

✓: works well. ✗: does not work well. FF: feedforward. Rec: recurrent. FC: fully-connected. Conv: convolutional. Limited: time-specific BN requires recording normalization statistics for each timestep and thus may not generalize to novel sequence lengths. *Layer normalization does not fail on these tasks but performs significantly worse than the best approaches.
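To make the idea of normalizing by different orders of statistical moments concrete, here is a minimal Python sketch. It assumes that Lp normalization means centering the activations and dividing by the p-th root of their p-th absolute central moment, so that p=2 recovers standard-deviation scaling and p=1 uses the mean absolute deviation. The names `lp_normalize` and `RunningLpNormalizer`, the `momentum` parameter, and the exponential moving average are illustrative assumptions, not taken from the paper. The running-statistics class only illustrates why a normalizer that does not depend on the current batch can work with one sample at a time; it is a forward-pass illustration, not the paper's Streaming Normalization algorithm.

```python
from typing import List, Sequence


def lp_normalize(x: Sequence[float], p: float = 1.0, eps: float = 1e-5) -> List[float]:
    """Center x, then divide by the p-th root of its p-th absolute central moment.

    p=2 gives the usual standard-deviation scaling; p=1 uses the mean
    absolute deviation, which needs no squares or square roots.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    moment = sum(abs(c) ** p for c in centered) / n
    scale = moment ** (1.0 / p) + eps
    return [c / scale for c in centered]


class RunningLpNormalizer:
    """Hypothetical helper: normalizes with running statistics, not batch ones."""

    def __init__(self, p: float = 1.0, momentum: float = 0.9, eps: float = 1e-5) -> None:
        self.p = p
        self.momentum = momentum  # assumed smoothing factor for the moving averages
        self.eps = eps
        self.mean = 0.0    # running estimate of the mean
        self.moment = 1.0  # running estimate of the p-th absolute central moment

    def update(self, batch: Sequence[float]) -> None:
        """Fold a batch (possibly of size 1) into the running statistics."""
        m = self.momentum
        batch_mean = sum(batch) / len(batch)
        self.mean = m * self.mean + (1.0 - m) * batch_mean
        # Deviations are measured from the running mean, so a single
        # sample can still contribute a non-zero spread.
        batch_moment = sum(abs(v - self.mean) ** self.p for v in batch) / len(batch)
        self.moment = m * self.moment + (1.0 - m) * batch_moment

    def normalize(self, batch: Sequence[float]) -> List[float]:
        """Normalize a batch using only the running statistics."""
        scale = self.moment ** (1.0 / self.p) + self.eps
        return [(v - self.mean) / scale for v in batch]


if __name__ == "__main__":
    print(lp_normalize([1.0, 2.0, 3.0, 4.0], p=1.0))
    norm = RunningLpNormalizer(p=1.0)
    for sample in [0.5, 1.5, -0.2, 2.0]:
        norm.update([sample])  # online setting: one sample at a time
        print(norm.normalize([sample]))
```

Because `normalize` reads only the running statistics, its output for a given input does not depend on which other samples share the batch, which is the property that batch-statistics normalization lacks in the online and small-batch columns of Table 1.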
Similar resources
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
Biologically plausible learning in recurrent neural networks for flexible cognitive tasks
Neural activity during cognitive tasks exhibits complex dynamics that flexibly encode task-relevant variables. Recurrent neural networks operating in the near-chaotic regime, which spontaneously generate rich dynamics, have been proposed as a model of cortical computation during cognitive tasks. However, existing methods for training these networks are either biologically implausible, and/or re...
Recent advances in efficient learning of recurrent networks
Recurrent neural networks (RNNs) carry the promise of implementing efficient and biologically plausible signal processing. They both are optimally suited for a wide area of applications when dealing with spatiotemporal data or causalities and provide explanation of cognitive phenomena of the human brain. Recently, a few new fundamental paradigms connected to RNNs have been developed which allow...
Less is More: Similarity of Time Series under Linear Transformations
When comparing time series, z-normalization preprocessing and dynamic time warping (DTW) distance became almost standard procedure. This paper makes a point against carelessly using this setup by discussing implications and alternatives. A (conceptually) simpler distance measure is proposed that allows for a linear transformation of amplitude and time only, but is also open for other normalizat...
Recurrent Neural Networks For Blind Separation of Sources
In this paper, fully connected recurrent neural networks are investigated for blind separation of sources. For these networks, a new class of unsupervised on-line learning algorithms are proposed. These algorithms are the generalization of the Hebbian/anti-Hebbian rule. They are not only biologically plausible but also theoretically sound. An important property of these algorithms is that the p...
Journal title:
- CoRR
Volume abs/1610.06160
Publication date 2016